Executive Summary

This report analyses trends and patterns in students’ applications to postgraduate programs. Using R in RStudio, we examined several of the variables that affect the application process. The results are intended to help educational institutions improve their postgraduate admissions strategies.

Table of Contents

1. Introduction

1.1 Overview

1.2 Questions We Posed

1.3 Scope of Analysis

2. Explaining the Dataset

2.1 Overview of Dataset

2.2 Problems with the dataset

3. Data Scraping and Methodology

3.1 Data Sources

3.2 Data Scraping

3.3 Variables Considered

3.4 Data Preprocessing

5. Conclusion

6. References

1. Introduction

1.1 Overview

Postgraduate programs are increasingly competitive, so choosing a good program at a reputable university is difficult and requires a thorough understanding of application trends and patterns. This analysis aims to provide insight into application outcomes at academic institutions, to help applicants through the admission process.

1.2 Questions We Posed

  • Which majors do students prefer?
  • How many students change their major between undergraduate and postgraduate study?
  • Which universities are preferred for a particular major?
  • What are the chances of admission or rejection?
  • Does CGPA really matter?
  • How much do factors such as research papers and work experience affect the chances of admission?
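As a minimal sketch of how some of these questions translate into computations, the base R snippet below counts major changes and compares admit rates. The rows are invented for illustration, and the column names (Status, UG.MAJOR, TARGET.MAJOR, Papers) are assumed to follow the variable names used later in this report; it is not the analysis code itself.

```r
# Invented example rows; not taken from the scraped dataset.
apps <- data.frame(
  Status       = c("Admit", "Reject", "Admit", "Admit", "Reject", "Admit"),
  UG.MAJOR     = c("Electrical", "Computer", "Mechanical", "Computer", "Civil", "Electrical"),
  TARGET.MAJOR = c("Computer", "Computer", "Mechanical", "AI_ML_DS", "Civil", "Electrical"),
  Papers       = c(1, 0, 0, 2, 0, 1),
  stringsAsFactors = FALSE
)

# How many applicants target a different major from their UG major?
changed_major <- apps$UG.MAJOR != apps$TARGET.MAJOR
n_changed <- sum(changed_major)

# Overall admit rate.
admit_rate <- mean(apps$Status == "Admit")

# Admit rate for applicants with and without at least one paper.
admit_by_papers <- tapply(apps$Status == "Admit", apps$Papers > 0, mean)

print(n_changed)
print(admit_rate)
print(admit_by_papers)
```

With real data, comparing majors this way assumes that UG and target majors have first been mapped to the same set of labels.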

1.3 Scope of Analysis

The study covers 100 students from each of seven colleges (IIT Bombay, IIT Delhi, IIT Madras, IIT Kanpur, IIT Kharagpur, IIT Roorkee and IIT Guwahati), holding different degrees and applying for postgraduate study at various foreign universities. For each applicant, we consider the CPI/CGPA, UG college and degree, GRE and TOEFL/IELTS scores, and work experience, together with whether the application was accepted or rejected.

2. Explaining the Dataset

2.1 Overview of the dataset

Our dataset describes students from different colleges and degree programs who applied for postgraduate study at foreign universities. For each applicant it records the CPI/CGPA, UG college and degree, GRE and TOEFL scores, work experience, and whether the application was accepted or rejected. So far, we have scraped data for the following colleges: IIT Bombay, IIT Delhi, IIT Madras, IIT Kanpur, IIT Kharagpur, IIT Roorkee and IIT Guwahati. For each college, we have scraped data for 100 students.

2.2 Problems with the dataset

Analysis of the scraped dataset revealed two notable problems: incomplete data and potential sample bias. Only about 2% of applicants had provided IELTS scores, and none of those entries was associated with any of the universities we collected data for, which is why removing the IELTS data did not affect the dataset.

The data with the IELTS score column:

Moreover, it was evident that not all rejected applicants had entered their data. The dataset is therefore an incomplete representation of the applicant pool and may suffer from selection bias. For this reason, we have exercised caution when drawing conclusions from this dataset, and closing these data gaps should be a priority for making subsequent analyses more robust and reliable. We have tried to mitigate the problems in the dataset through preprocessing.
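The two checks described above can be sketched in base R as follows. The data frame is a toy stand-in with invented values, and the column names TOEFL and IELTS are assumptions about how the scores were stored at that stage; the point is only to show how the share of reported scores and the admit/reject balance can be quantified.

```r
# Toy records standing in for the scraped data; all values are invented.
raw <- data.frame(
  Status = c("Admit", "Admit", "Reject", "Admit", "Admit"),
  TOEFL  = c(110, NA, 105, 112, 108),
  IELTS  = c(NA, 7.5, NA, NA, NA),
  stringsAsFactors = FALSE
)

# Share of applicants who reported each test score.
reported_share <- colMeans(!is.na(raw[, c("TOEFL", "IELTS")]))

# Balance between admits and rejects; a heavy skew towards admits is
# one sign that rejected applicants may be under-reported.
status_share <- prop.table(table(raw$Status))

print(reported_share)
print(status_share)
```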

3. Data Scraping and Methodology

3.1 Data Sources

The dataset comprises application records from the website https://admits.fyi/

3.2 Data Scraping

The libraries we used for scraping are:

  • tidyverse

  • rvest

  • RSelenium

  • netstat

We used ChromeDriver to automate interactions with the Chrome browser, driving it through the RSelenium library together with rvest in RStudio.

We used netstat to obtain information about network connections, routing tables, interface statistics, and other networking-related details.
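A minimal sketch of how these libraries can fit together is shown below. It is not the scraper used for this report: the HTML snippet, the CSS class names (profile-card, university, status, cgpa) and the helper names parse_profiles and fetch_page_source are all hypothetical, since the page structure of the website is not described here. The use of netstat::free_port() to pick a port for the Selenium server is likewise an assumption about how netstat might be used.

```r
library(rvest)

# Hypothetical markup; the real page structure is not documented in this report.
sample_html <- '
<div class="profile-card">
  <span class="university">Example University</span>
  <span class="status">Admit</span>
  <span class="cgpa">8.7</span>
</div>
<div class="profile-card">
  <span class="university">Another University</span>
  <span class="status">Reject</span>
  <span class="cgpa">7.9</span>
</div>'

# Turn one rendered page (an HTML string) into a data frame of records.
parse_profiles <- function(page_source) {
  cards <- html_elements(read_html(page_source), ".profile-card")
  data.frame(
    University = html_text2(html_element(cards, ".university")),
    Status     = html_text2(html_element(cards, ".status")),
    CGPA       = as.numeric(html_text2(html_element(cards, ".cgpa"))),
    stringsAsFactors = FALSE
  )
}

# Fetch the rendered page source through a Selenium-driven Chrome session.
# Defined but not called here, because it needs a local Chrome and ChromeDriver.
fetch_page_source <- function(url) {
  driver <- RSelenium::rsDriver(
    browser = "chrome",
    port    = netstat::free_port(),  # assumed use: pick an unused local port
    verbose = FALSE
  )
  client <- driver$client
  on.exit({
    client$close()
    driver$server$stop()
  }, add = TRUE)
  client$navigate(url)
  Sys.sleep(3)  # crude wait for JavaScript-rendered content
  client$getPageSource()[[1]]
}

profiles <- parse_profiles(sample_html)
print(profiles)
```

In a live run, the string returned by fetch_page_source("https://admits.fyi/") would be passed to parse_profiles in place of sample_html, with the selectors adjusted to the actual markup.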

3.3 Variables Considered

University

The university in which the applicant has applied for the PG program.

Status

Whether the application was accepted or rejected.

Target Major

The major for which the applicant has applied.

Term

Academic term for which the applicant has applied.

GRE

GRE (Graduate Record Examination) score, made up of the scores for three sections: Verbal Reasoning, Quantitative Reasoning and Analytical Writing.

TOEFL/IELTS

The applicant's TOEFL or IELTS score.

UG College

The college from which the applicant has completed their UG program.

UG Major

Undergraduate major of the applicant.

CGPA

Cumulative Grade Point Average representing the average of the grade points obtained in all courses.

Papers

Number of research papers written.

Work Ex

Work Experience of the applicant.
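To make the variable list concrete, the sketch below builds a single illustrative record with the raw column names listed above and labels each column as numeric or categorical. Every value is invented, and the raw formats (for example a GRE string such as "Q168 V158 AWA4.5", or work experience counted in months) are assumptions rather than facts about the scraped data.

```r
# One invented record using the raw column names from the list above.
raw_example <- data.frame(
  "University"   = "Example University",
  "Status"       = "Admit",
  "Target Major" = "Computer Science",
  "Term"         = "Fall 2023",
  "GRE"          = "Q168 V158 AWA4.5",  # assumed raw format
  "TOEFL/IELTS"  = 110,
  "UG College"   = "IIT Bombay",
  "UG Major"     = "Electrical Engineering",
  "CGPA"         = 8.7,
  "Papers"       = 1L,
  "Work Ex"      = 12L,                 # assumed to be in months
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Label each column by how it would be treated in the analysis.
column_kinds <- ifelse(sapply(raw_example, is.numeric), "numeric", "categorical")
print(column_kinds)
```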

3.4 Data Preprocessing

  • We used the dplyr library for preprocessing.

  • Removed the empty (NULL) rows from the scraped data.

  • Scaled all CGPAs to a 10-point scale.

  • Made course baskets for target major courses:

    • Electrical & Computer Engineering, Computer Science, Computer Engineering, Computing Science, Applied Computing, Software Engineering, Information Management and Systems, Cyber Security, Computational Science & Engineering, Information Technology Management, Computer & Information Science, Information Technology, Computer Networks, Big Data, Information Systems under Computer.

    • Data Science, Data Analytics, Artificial Intelligence, Machine Learning, Robotics, Computational and Mathematical Engineering, Bioinformatics, Data Science and Business Analytics under AI_ML_DS

    • Electrical Engineering, EECS, Telecommunications Engineering under Electrical

    • Mechanical Engineering, Industrial Engineering, Industrial and Systems Engineering under Mechanical

    • Chemical Engineering, Chemical and Petroleum Engineering under Chemical

    • Civil Engineering, Civil & Environmental Engineering under Civil

    • Finance, Business Analytics, Business Analytics and Information Syste, MBA, Business Analytics Flex, Business Intelligence and Analytics under Business

    • Engineering Management, Information Management, Supply Chain Management, Management Science and Engineering under Management

  • Separated the TOEFL and IELTS scores using an appropriate condition and then removed the IELTS column.

  • Changed the names of the following columns:

    • University to UNIVERSITY.NAME
    • Target Major to TARGET.MAJOR
    • GRE split into GRE.Q, GRE.V, GRE.AWA and GRE.TOTAL
    • TOEFL/IELTS to TOEFL
    • UG College to UG.COLLEGE
    • UG Major to UG.MAJOR
    • Work Ex to WORK.EXPERIENCE.M
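The preprocessing steps listed above can be sketched with dplyr roughly as follows. This is an illustration under stated assumptions, not the script used for the report: the input rows are invented, the GRE string format is assumed, the CGPA rescaling rule (values up to 4 treated as a 4-point scale, values above 10 as percentages) is a heuristic, the TOEFL/IELTS condition (scores of 9 or below treated as IELTS bands) is a guess at the "appropriate condition", GRE.TOTAL is assumed to be the sum of the Quantitative and Verbal scores, and the basket lookup covers only a few of the majors.

```r
library(dplyr)

# Invented raw rows, including one fully empty row.
raw <- data.frame(
  "University"   = c("Example University", "Another University", NA),
  "Target Major" = c("Computer Science", "Data Science", NA),
  "GRE"          = c("Q168 V158 AWA4.5", "Q165 V155 AWA4.0", NA),
  "TOEFL/IELTS"  = c(110, 7.5, NA),
  "UG College"   = c("IIT Bombay", "IIT Delhi", NA),
  "UG Major"     = c("Electrical Engineering", "Computer Science", NA),
  "CGPA"         = c(8.7, 3.6, NA),
  "Work Ex"      = c(12, 0, NA),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Partial lookup from target major to course basket.
baskets <- c(
  "Computer Science"       = "Computer",
  "Software Engineering"   = "Computer",
  "Data Science"           = "AI_ML_DS",
  "Machine Learning"       = "AI_ML_DS",
  "Electrical Engineering" = "Electrical",
  "Mechanical Engineering" = "Mechanical"
)

clean <- raw %>%
  # Keep only rows that have at least one non-missing value.
  filter(if_any(everything(), ~ !is.na(.x))) %>%
  rename(
    UNIVERSITY.NAME   = `University`,
    TARGET.MAJOR      = `Target Major`,
    UG.COLLEGE        = `UG College`,
    UG.MAJOR          = `UG Major`,
    WORK.EXPERIENCE.M = `Work Ex`
  ) %>%
  mutate(
    # Heuristic rescaling to a 10-point scale (assumption).
    CGPA = case_when(
      CGPA <= 4  ~ CGPA * 2.5,
      CGPA <= 10 ~ CGPA,
      TRUE       ~ CGPA / 10
    ),
    # Map to a basket; keep the original label when no basket is known.
    TARGET.MAJOR = coalesce(unname(baskets[TARGET.MAJOR]), TARGET.MAJOR),
    # Split the assumed "Q168 V158 AWA4.5" format into its parts.
    GRE.Q     = as.numeric(sub(".*Q([0-9]+).*", "\\1", GRE)),
    GRE.V     = as.numeric(sub(".*V([0-9]+).*", "\\1", GRE)),
    GRE.AWA   = as.numeric(sub(".*AWA([0-9.]+).*", "\\1", GRE)),
    GRE.TOTAL = GRE.Q + GRE.V,
    # Scores of 9 or below are treated as IELTS bands and dropped (assumption).
    TOEFL = if_else(`TOEFL/IELTS` > 9, `TOEFL/IELTS`, NA_real_)
  ) %>%
  select(-GRE, -`TOEFL/IELTS`)

print(clean)
```

The lookup vector keeps the basket definitions in one place, so extending it to the full lists above only means adding entries.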

The preprocessed data table looks like this:

5. Conclusion

6. References